An Incremental Learning Based Framework for Image Spam Filtering
Abstract
Image spam remains an unsolved problem for two reasons. The first is the diversity of spamming tricks. The second is the evolving nature of image spam: as new spam constantly emerges, filters' effectiveness drops over time. In this paper, we present an effective anti-spam approach that addresses both problems. First, a novel clustering filter is proposed. By exploiting a density-based clustering algorithm, the proposed filter is robust to spamming tricks. Then, we present a hierarchical framework that combines the clustering filter with other machine-learning-based classifiers to further improve filtering capacity. Moreover, an incremental learning mechanism is integrated so that the proposed framework can adjust itself to overcome new image spamming tricks. We evaluate the proposed framework on two public spam corpora. The experimental results show that the proposed framework achieves high precision along with a low false positive rate.
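The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each image has already been reduced to a numeric feature vector, that "density-based" can be approximated by a DBSCAN-style core-point test (at least `min_pts` known spam vectors within radius `eps`), and that the second stage is an arbitrary classifier passed in as a function. The names `DensityFilter`, `classify`, `eps`, `min_pts`, and `fallback` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


class DensityFilter:
    """Flags a sample as spam when it lies in a dense region of known spam.

    Assumption: a DBSCAN-style density test stands in for the clustering
    filter described in the abstract.
    """

    def __init__(self, eps: float, min_pts: int) -> None:
        self.eps = eps              # neighbourhood radius (hypothetical parameter)
        self.min_pts = min_pts      # neighbours required to count as dense
        self.spam: List[Tuple[float, ...]] = []

    def learn(self, features: Vector) -> None:
        """Incremental step: store one newly confirmed spam vector."""
        self.spam.append(tuple(features))

    def is_spam(self, features: Vector) -> bool:
        """Return True if enough stored spam vectors lie within eps."""
        neighbours = sum(
            1 for s in self.spam if math.dist(s, features) <= self.eps
        )
        return neighbours >= self.min_pts


def classify(
    features: Vector,
    density_filter: DensityFilter,
    fallback: Callable[[Vector], bool],
) -> bool:
    """Two-stage decision: density filter first, then the fallback classifier."""
    if density_filter.is_spam(features):
        return True
    return fallback(features)


# Toy usage with made-up 2-D feature vectors.
spam_filter = DensityFilter(eps=0.5, min_pts=2)
for v in [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2)]:
    spam_filter.learn(v)


def always_ham(features: Vector) -> bool:
    """Placeholder second-stage classifier that never flags spam."""
    return False


near_known = classify((0.05, 0.05), spam_filter, always_ham)
new_trick_before = classify((5.0, 5.0), spam_filter, always_ham)

# After two confirmed examples of the new trick are learned,
# the same sample falls in a dense region and is flagged.
spam_filter.learn((5.0, 5.1))
spam_filter.learn((5.1, 5.0))
new_trick_after = classify((5.0, 5.0), spam_filter, always_ham)

print(near_known, new_trick_before, new_trick_after)
```

In this toy setup, adapting to a new spamming trick requires no retraining: confirmed examples are simply appended to the stored set, which is one simple way to read "incremental learning" here.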
Related articles
An Anti-spam Filter Combination Framework for Text-and-Image Emails through Incremental Learning
We present an anti-spam filtering framework that combines text-based and image-based anti-spam filters. First, an incremental learning approach to reducing mismatches between training and test datasets is proposed to resolve the problem of a lack of training data for legitimate emails that contain both text and images. Then, the outputs of text-based and image-based filters are combined with th...
A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System
In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use a boosting ensemble of weak classifiers to implement the misuse intrusion detection system. It can identify new types of intrusions that do not exist in the training dataset for incremental misuse detection. As...
Detecting Image Spam Using Image Texture Features
Filtering image email spam is considered a challenging problem because spammers keep modifying the images used in their campaigns by employing different obfuscation techniques, thereby preventing text recognition with Optical Character Recognition (OCR) tools and imposing additional challenges in filtering this type of spam. In this paper, we propose an image spam filtering tech...
An efficient incremental learning mechanism for tracking concept drift in spam filtering
This research conducts an in-depth analysis of spam and aims to propose an efficient spam filtering method that can adapt to a dynamic environment. We focus on the analysis of email headers and apply a decision tree data mining technique to look for association rules about spam. Then, we propose an efficient systematic filtering method based on these asso...
Spam Filtering Using Statistical Data Compression Models
Spam filtering poses a special problem in text categorization, of which the defining characteristic is that filters face an active adversary, which constantly attempts to evade filtering. Since spam evolves continuously and most practical applications are based on online user feedback, the task calls for fast, incremental and robust learning algorithms. In this paper, we investigate a novel app...
Publication date: 2014